
    Speech-evoked brain activity is more robust to competing speech when it is spoken by someone familiar: Speech representations for familiar voices

    When speech is masked by competing sound, people are better at understanding what is said if the talker is familiar compared to unfamiliar. The benefit is robust, but how does processing of familiar voices facilitate intelligibility? We combined high-resolution fMRI with representational similarity analysis to quantify the difference in distributed activity between clear and masked speech. We demonstrate that brain representations of spoken sentences are less affected by a competing sentence when they are spoken by a friend or partner than by someone unfamiliar—effectively, showing a cortical signal-to-noise ratio (SNR) enhancement for familiar voices. This effect correlated with the familiar-voice intelligibility benefit. We functionally parcellated auditory cortex and found that the most prominent familiar-voice advantage was manifest along the posterior superior and middle temporal gyri. Overall, our results demonstrate that experience-driven improvements in intelligibility are associated with enhanced multivariate pattern activity in posterior temporal cortex.
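
    For readers unfamiliar with representational similarity analysis, the sketch below illustrates the core computation: pairwise dissimilarities between condition-specific activity patterns are assembled into representational dissimilarity matrices (RDMs), which are then compared across conditions. The array sizes, the correlation-distance metric, and the random placeholder data are illustrative assumptions, not the study's actual pipeline.

```python
import numpy as np
from scipy.spatial.distance import pdist
from scipy.stats import spearmanr

# Hypothetical inputs: one multivoxel activity pattern per spoken sentence;
# rows = sentences, columns = voxels (random placeholders, not real data).
rng = np.random.default_rng(0)
clear_patterns = rng.normal(size=(20, 500))   # sentences heard in quiet
masked_patterns = rng.normal(size=(20, 500))  # same sentences with a competing talker

# Representational dissimilarity matrices (RDMs): pairwise correlation
# distance between sentence-evoked patterns within each condition.
rdm_clear = pdist(clear_patterns, metric="correlation")
rdm_masked = pdist(masked_patterns, metric="correlation")

# Compare the two RDMs: the more similar they are, the less the competing
# sentence has disrupted the representation of the target speech.
rho, _ = spearmanr(rdm_clear, rdm_masked)
print(f"clear-vs-masked RDM similarity: rho = {rho:.3f}")
```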

    Intelligibility benefit for familiar voices is not accompanied by better discrimination of fundamental frequency or vocal tract length

    Speech is more intelligible when it is spoken by familiar than unfamiliar people. If this benefit arises because key voice characteristics, like perceptual correlates of fundamental frequency or vocal tract length (VTL), are more accurately represented for familiar voices, listeners may be able to discriminate smaller manipulations to such characteristics for familiar than unfamiliar voices. We measured participants’ (N = 17) thresholds for discriminating pitch (correlate of fundamental frequency, or glottal pulse rate) and formant spacing (correlate of VTL; ‘VTL-timbre’) for voices that were familiar (participants’ friends) and unfamiliar (other participants’ friends). As expected, familiar voices were more intelligible. However, discrimination thresholds were no smaller for familiar than for unfamiliar voices. The size of the intelligibility benefit for a familiar over an unfamiliar voice did not relate to the difference in discrimination thresholds for the same voices. Also, the familiar-voice intelligibility benefit was just as large following perceptible manipulations to pitch and VTL-timbre. These results are more consistent with cognitive accounts of speech perception than with traditional accounts that predict better discrimination.
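
    The abstract does not specify the psychophysical procedure, but discrimination thresholds of this kind are commonly estimated with an adaptive staircase. The sketch below shows a generic 2-down/1-up staircase; the starting value, step sizes, and the toy listener are assumptions for illustration only, not details from the study.

```python
import random

def two_down_one_up(respond, start=8.0, step=4.0, min_step=0.25, n_reversals=8):
    """Estimate a discrimination threshold with a 2-down/1-up adaptive staircase.

    `respond(diff)` returns True when the listener correctly discriminates a
    manipulation of size `diff` (e.g., a pitch shift in semitones). This rule
    converges on the ~70.7%-correct point of the psychometric function.
    """
    diff, streak, last_move = start, 0, None
    reversals = []
    while len(reversals) < n_reversals:
        if respond(diff):
            streak += 1
            if streak < 2:
                continue
            streak, move = 0, "down"           # two correct in a row -> harder
        else:
            streak, move = 0, "up"             # one error -> easier
        if last_move is not None and move != last_move:
            reversals.append(diff)             # direction change = reversal
            step = max(step / 2, min_step)     # shrink step after each reversal
        diff = max(diff - step, 0.0) if move == "down" else diff + step
        last_move = move
    return sum(reversals[-4:]) / 4             # mean of the last four reversals

# Toy listener: chance of a correct response rises with the manipulation size.
estimate = two_down_one_up(lambda d: random.random() < min(0.95, 0.5 + d / 10))
print(f"estimated discrimination threshold: {estimate:.2f}")
```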

    The Benefit to Speech Intelligibility of Hearing a Familiar Voice

    Previous experience with a voice can help listeners understand speech when a competing talker is present. Using the coordinate-response measure task (Bolia, Nelson, Ericson, & Simpson, 2000), Johnsrude et al. (2013) demonstrated that speech is more intelligible when either the target or competing (masking) talker is a long-term spouse than when both talkers are unfamiliar (termed familiar-target and familiar-masker benefits, respectively). To better understand how familiarity improves intelligibility, we measured the familiar-target and familiar-masker benefits in older and younger spouses using a more challenging matrix task, and compared the benefits listeners gain from spouses' and friends' voices. On each trial, participants heard two sentences from the Boston University Gerald corpus (Kidd, Best, & Mason, 2008) and reported words from the sentence beginning with a target name. A familiar-masker benefit was not observed, but all groups showed a robust familiar-target benefit, and its magnitude did not differ between spouses and friends. The familiar-target benefit was not influenced by relationship length (in the range of 1.5–52 years). Together, these results imply that the familiar-target benefit can develop from various types of relationships and has already reached a plateau around 1.5 years after meeting a new friend.

    Pupil Dilation Is Sensitive to Semantic Ambiguity and Acoustic Degradation

    Speech comprehension is challenged by background noise, acoustic interference, and linguistic factors, such as the presence of words with more than one meaning (homonyms and homophones). Previous work suggests that homophony in spoken language increases cognitive demand. Here, we measured pupil dilation—a physiological index of cognitive demand—while listeners heard high-ambiguity sentences, containing words with more than one meaning, or well-matched low-ambiguity sentences without ambiguous words. This semantic-ambiguity manipulation was crossed with an acoustic manipulation in two experiments. In Experiment 1, sentences were masked with 30-talker babble at 0 and +6 dB signal-to-noise ratio (SNR), and in Experiment 2, sentences were heard with or without a pink noise masker at –2 dB SNR. Speech comprehension was measured by asking listeners to judge the semantic relatedness of a visual probe word to the previous sentence. In both experiments, comprehension was lower for high- than for low-ambiguity sentences when SNRs were low. Pupils dilated more when sentences included ambiguous words, even when no noise was added (Experiment 2). Pupils also dilated more when SNRs were low. The effect of masking was larger than the effect of ambiguity for performance and pupil responses. This work demonstrates that the presence of homophones, which are ubiquitous in natural language, increases cognitive demand and reduces the intelligibility of speech heard against a noisy background.
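
    Presenting speech at a fixed SNR, as in both experiments, reduces to scaling the masker so that the speech-to-masker power ratio hits the target value in dB. A minimal sketch of that computation follows; the placeholder signals and signal length are assumptions, not the study's stimuli.

```python
import numpy as np

def mix_at_snr(speech, masker, snr_db):
    """Scale `masker` so that the speech-to-masker power ratio equals
    `snr_db`, then return the mixture (1-D arrays of equal length)."""
    p_speech = np.mean(speech ** 2)
    p_masker = np.mean(masker ** 2)
    # Solve 10*log10(p_speech / (gain**2 * p_masker)) = snr_db for gain.
    gain = np.sqrt(p_speech / (p_masker * 10 ** (snr_db / 10)))
    return speech + gain * masker

# Placeholder arrays standing in for a recorded sentence and a masker.
rng = np.random.default_rng(1)
sentence, masker = rng.normal(size=16000), rng.normal(size=16000)
mixture = mix_at_snr(sentence, masker, snr_db=-2.0)  # -2 dB SNR, as in Experiment 2
```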

    Attentional Modulation of Envelope-Following Responses at Lower (93–109 Hz) but Not Higher (217–233 Hz) Modulation Rates

    Directing attention to sounds of different frequencies allows listeners to perceive a sound of interest, such as a talker, in a mixture. Whether cortically generated frequency-specific attention affects responses as early as the auditory brainstem is currently unclear. Participants attended to either a high- or a low-frequency tone stream; the two streams were presented simultaneously and tagged with different amplitude-modulation (AM) rates. In a replication design, we showed that envelope-following responses (EFRs) were modulated by attention only when the stimulus AM rate was slow enough for the auditory cortex to track, and not for stimuli with faster AM rates, which are thought to reflect ‘purer’ brainstem sources. Thus, we found no evidence of frequency-specific attentional modulation that can be confidently attributed to brainstem generators. The results demonstrate that different neural populations contribute to EFRs at higher and lower rates, compatible with cortical contributions at lower rates. The results further demonstrate that stimulus AM rate can alter the conclusions of EFR studies. This work was supported by funding from the Canadian Institutes of Health Research (CIHR; Operating Grant: MOP 133450) and the Natural Sciences and Engineering Research Council of Canada (NSERC; Discovery Grant: 327429-2012). Authors R. P. Carlyon and H. E. Gockel were supported by intramural funding from the Medical Research Council [SUAG/007 RG91365].
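
    The frequency-tagging logic can be made concrete with a short sketch: each tone stream is amplitude-modulated at its own rate, and the EFR is read out as spectral energy at the tagging rate. The carrier frequencies, the exact rates (chosen to echo the rate ranges in the title), and the rectified stand-in for the neural response are illustrative assumptions, not the study's stimuli or recordings.

```python
import numpy as np

fs = 8000                              # sample rate (Hz); a round value for illustration
t = np.arange(0, 1.0, 1 / fs)          # 1 s of signal -> 1 Hz frequency resolution

def am_tone(f_carrier, f_mod, depth=1.0):
    """Sinusoidally amplitude-modulated tone: (1 + m*sin(2*pi*fm*t)) * carrier."""
    return (1 + depth * np.sin(2 * np.pi * f_mod * t)) * np.sin(2 * np.pi * f_carrier * t)

# Two simultaneous tone streams, each tagged with its own AM rate: a low
# carrier tagged near 100 Hz and a high carrier tagged near 225 Hz.
stimulus = am_tone(500, 101) + am_tone(3000, 225)

# Stand-in for the neural response: half-wave rectification recovers envelope
# energy at each tagging rate, which is what the EFR follows.
response = np.maximum(stimulus, 0)
spectrum = np.abs(np.fft.rfft(response)) / len(response)
freqs = np.fft.rfftfreq(len(response), 1 / fs)
for f_tag in (101, 225):
    amp = spectrum[np.argmin(np.abs(freqs - f_tag))]
    print(f"response amplitude at the {f_tag} Hz tag: {amp:.4f}")
```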

    A sound-sensitive source of alpha oscillations in human non-primary auditory cortex

    The functional organization of human auditory cortex can be probed by characterizing responses to various classes of sound at different anatomical locations. Along with histological studies, this approach has revealed a primary field in posteromedial Heschl's gyrus (HG) with pronounced induced high-frequency (70–150 Hz) activity and short-latency responses that phase-lock to rapid transient sounds. Low-frequency neural oscillations are also relevant to stimulus processing and information flow; however, their distribution within auditory cortex has not been established. Alpha activity (7–14 Hz) in particular has been associated with processes that may differentially engage earlier versus later levels of the cortical hierarchy, including functional inhibition and the communication of sensory predictions. These theories derive largely from the study of occipitoparietal sources readily detectable in scalp electroencephalography. To characterize the anatomical basis and functional significance of less accessible temporal-lobe alpha activity, we analyzed responses to sentences in seven human adults (four female) with epilepsy who had been implanted with electrodes in superior temporal cortex. In contrast to primary cortex in posteromedial HG, a non-primary field in anterolateral HG was characterized by high spontaneous alpha activity that was strongly suppressed during auditory stimulation. Alpha-power suppression decreased with distance from anterolateral HG throughout superior temporal cortex, and was more pronounced for clear compared to degraded speech. This suppression could not be accounted for solely by a change in the slope of the power spectrum. The differential manifestation and stimulus-sensitivity of alpha oscillations across auditory fields should be accounted for in theories of their generation and function. SIGNIFICANCE STATEMENT: To understand how auditory cortex is organized in support of perception, we recorded from patients implanted with electrodes for clinical reasons. This allowed measurement of activity in brain regions at different levels of sensory processing. Oscillations in the alpha range (7–14 Hz) have been associated with functions including sensory prediction and inhibition of regions handling irrelevant information, but their distribution within auditory cortex is not known. A key finding was that these oscillations dominated in one particular non-primary field, anterolateral Heschl's gyrus, and were suppressed when subjects listened to sentences. These results build on our knowledge of the functional organization of auditory cortex and provide anatomical constraints on theories of the generation and function of alpha oscillations.
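
    The control analysis mentioned above, ruling out a simple change in the slope of the power spectrum, can be illustrated by fitting the aperiodic 1/f background and measuring alpha-band power relative to it. The sketch below uses placeholder data, an assumed sampling rate, and assumed fitting ranges; it is not the study's analysis pipeline.

```python
import numpy as np
from scipy.signal import welch

# Placeholder for an intracranial recording epoch (fs and data are assumptions).
fs = 1000
rng = np.random.default_rng(2)
x = rng.normal(size=10 * fs)           # stand-in for 10 s of electrode data

freqs, psd = welch(x, fs=fs, nperseg=2 * fs)

# Fit the aperiodic (1/f) background in log-log space, excluding the alpha
# band, then measure how far alpha power sits above that background. This
# separates a genuine alpha oscillation from a mere change in spectral slope.
fit_mask = (freqs >= 2) & (freqs <= 40) & ~((freqs >= 7) & (freqs <= 14))
slope, intercept = np.polyfit(np.log10(freqs[fit_mask]), np.log10(psd[fit_mask]), 1)

alpha_mask = (freqs >= 7) & (freqs <= 14)
background = 10 ** (intercept + slope * np.log10(freqs[alpha_mask]))
alpha_excess_db = 10 * np.mean(np.log10(psd[alpha_mask] / background))
print(f"alpha power above the 1/f background: {alpha_excess_db:.2f} dB")
```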

    Repetition Enhancement for Frequency-Modulated but Not Unmodulated Sounds: A Human MEG Study

    BACKGROUND: Decoding of frequency-modulated (FM) sounds is essential for phoneme identification. This study investigates selectivity to FM direction in the human auditory system. METHODOLOGY/PRINCIPAL FINDINGS: Magnetoencephalography was recorded in 10 adults during a two-tone adaptation paradigm with a 200-ms interstimulus interval. Stimuli were tone pairs with either the same or different frequency-modulation direction. To verify that FM repetition effects could not be accounted for by onset and offset properties, we additionally assessed responses to pairs of unmodulated tones with either the same or different frequency composition. For the FM sweeps, N1m event-related magnetic field components were found at 103 and 130 ms after the onset of the first (S1) and second stimulus (S2), respectively. This was followed by a sustained component starting at about 200 ms after S2. The sustained response was significantly stronger for stimulation with the same compared to different FM direction. This effect was not observed for the unmodulated control stimuli. CONCLUSIONS/SIGNIFICANCE: Low-level processing of FM sounds was characterized by repetition enhancement for stimulus pairs with the same versus different FM directions. This effect was FM-specific; it did not occur for unmodulated tones. The present findings may reflect specific interactions between frequency separation and temporal distance in the processing of consecutive FM sweeps.
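
    As a rough illustration of the stimulus design, the sketch below generates rising and falling FM sweeps and assembles same- and different-direction pairs separated by the 200-ms interstimulus interval. The carrier frequencies and sweep duration are assumptions; only the interstimulus interval comes from the abstract.

```python
import numpy as np
from scipy.signal import chirp

fs = 16000                                   # sample rate (Hz); an assumption
t = np.arange(0, 0.1, 1 / fs)                # 100-ms sweeps; duration is assumed

up = chirp(t, f0=500, f1=2000, t1=t[-1])     # rising FM sweep
down = chirp(t, f0=2000, f1=500, t1=t[-1])   # falling FM sweep
gap = np.zeros(int(0.2 * fs))                # the 200-ms interstimulus interval

same_pair = np.concatenate([up, gap, up])         # S1 and S2 share FM direction
different_pair = np.concatenate([up, gap, down])  # FM direction changes at S2
```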